8.1 Figure 8.31 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers.

Left: ACF for a white noise series of 36 numbers.
Middle: ACF for a white noise series of 360 numbers.
Right: ACF for a white noise series of 1,000 numbers.

(a.) Explain the differences among these figures. Do they all indicate that the data are white noise?

The three figures show the ACFs of randomly generated values. Because each value is drawn independently at random (from a sufficiently large pool of potential values), there is no correlation between one value and the next, or between any other pair of samples in the set. By construction, such a series is white noise.

Looking at the three autocorrelation plots, this appears to hold: no spike decisively crosses the critical bounds, although a few come close, such as lag 11 in the 36-sample ACF. There is likewise no distinct upward, downward, or periodic trend in the autocorrelations.

In general, all three ACF plots imply that the series could be considered white noise.
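Figures like these can be reproduced directly (a sketch assuming the forecast package used elsewhere in this document; the seed is arbitrary):

```r
library(forecast)  # ggAcf()

set.seed(30)  # arbitrary, for reproducibility
ggAcf(ts(rnorm(36)))    # widest critical bands
ggAcf(ts(rnorm(360)))   # narrower bands
ggAcf(ts(rnorm(1000)))  # narrowest bands
```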

(b.) Why are the critical values at different distances from the mean of zero? Why are the autocorrelations different in each figure when they each refer to white noise?

The critical values are approximately \(\pm 1.96/\sqrt{T-d}\), where \(T\) is the number of observations in the series and \(d\) is the number of differences applied (here \(d = 0\)). As the number of samples increases, the critical value decreases, narrowing the band within which a spike is consistent with no autocorrelation.
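As a quick check of the narrowing bands, the bounds for the three series (with \(d = 0\)) work out to roughly \(\pm 0.33\), \(\pm 0.10\) and \(\pm 0.06\):

```r
# 95% critical values for the ACF of a white noise series of length T
# (no differencing, so d = 0)
crit <- function(T) 1.96 / sqrt(T)

crit(36)    # ~0.327
crit(360)   # ~0.103
crit(1000)  # ~0.062
```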

Because each sample is distinct (the 360 values were drawn from the same distribution as the 36, but are a different realization) and the values come from a random number generator with no underlying formula, no pattern should emerge. In fact, the more samples you draw, the closer the sample autocorrelations should get to zero, so the series looks more clearly random as it grows.

8.2 A classic example of a non-stationary series is the daily closing IBM stock price series (data set ibmclose).

Use R to plot the daily closing prices for IBM stock and the ACF and PACF. Explain how each plot shows that the series is non-stationary and should be differenced.

data("ibmclose")
ibmclose%>%
    autoplot(col='darkorchid')

From the basic time plot alone you cannot be certain that the data are non-stationary, but the series is definitely not white noise, which suggests some non-stationarity that should be explored using the ACF and PACF plots.

ggAcf(ibmclose, col='darkorchid')

The consistently slow decay of the ACF, which never drops below the critical values, is indicative of a non-stationary time series. So before moving forward with an ARIMA model, some estimate of the differencing required is necessary.
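The amount of differencing needed can also be estimated programmatically with `ndiffs()` from the forecast package, which applies a unit root test (a sketch; the fma package supplying ibmclose is an assumption, matching the `data("ibmclose")` call above):

```r
library(fma)       # ibmclose data set (assumed source)
library(forecast)  # ndiffs()

ndiffs(ibmclose)   # number of first differences suggested by a unit root (KPSS) test
```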

ggPacf(ibmclose, col='darkorchid')

The PACF shows non-significant sinusoidal variation and a significant spike at lag 1. The appropriate action is to difference the data and revisit the two plots:

ibmclose%>%diff()%>%ggAcf( col='darkorchid')

ibmclose%>%diff()%>%ggPacf( col='darkorchid')

After differencing the data, both the ACF and PACF seem to have a relatively random (read: white noise) progression. There are some spikes in both, but they do not form a pattern. I would attempt an ARIMA(0,1,0) model first, see how well it works, and tune from there.

8.6 Use R to simulate and plot some data from simple ARIMA models

(a.) Use the following R code to generate data from an AR(1) model with \(\phi = 0.6\) and \(\sigma^2 = 1\). The process starts with \(y_1 = 0\).

(b.) Produce a time plot for the series. How does the plot change as you change \(\phi_1\)?

AR_one <- function(phi, sigma2){
    # Simulate 100 observations from an AR(1) process, starting at y[1] = 0
    y <- ts(numeric(100))
    e <- rnorm(n = 100, sd = sqrt(sigma2))  # sigma2 is a variance; rnorm takes sd
    for (i in 2:100){
        y[i] <- phi*y[i-1] + e[i]
    }
    return(y)
}
AR_one(.6, 1)%>% autoplot(main='Random Series with phi=0.6', col='darkgreen')

The initial series appears to wander randomly, driven by the \(\phi\) weight applied to \(y_{t-1}\) and the added random error. Let's look at a progression of \(\phi\) from 0 to 1 in steps of 0.2.
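The progression can be generated by looping over the AR_one() helper defined above (a sketch; gridExtra and its grid.arrange() are assumed available for tiling the plots):

```r
library(forecast)   # autoplot() for ts objects
library(gridExtra)  # grid.arrange(), an assumed dependency

set.seed(7)  # arbitrary
plots <- lapply(seq(0, 1, by = 0.2), function(phi) {
  autoplot(AR_one(phi, 1), main = paste('phi =', phi), col = 'darkgreen')
})
do.call(grid.arrange, c(plots, ncol = 2))
```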

Pretty much white noise (no obvious discernible patterns)

Clearly, as \(\phi\) becomes larger, the variation from point to point becomes smoother. This is because a larger \(\phi\) weight preserves a larger portion of \(y_{t-1}\), so the added error \(e_t\) is less influential than it would be if very little of \(y_{t-1}\) were preserved. At \(\phi = 1\) the process becomes a random walk.

(c.) Write your own code to generate data from an MA(1) model with \(\theta_1 = 0.6\) and \(\sigma^2 = 1\)

MA_one <- function(theta, sigma2){
    # Simulate 100 observations from an MA(1) process
    e <- rnorm(n = 100, sd = sqrt(sigma2))  # sigma2 is a variance; rnorm takes sd
    y <- ts(numeric(100))
    for (i in 2:100){
        y[i] <- theta*e[i-1] + e[i]
    }
    return(y)
}

MA_one(.6, 1)%>%
    autoplot(main = 'MA(1) model with Theta = .6' , col ='darkred')

This model, too, seems to exhibit a white-noise-like pattern without any discernible trends or cycles.

MA_one(.2, 1)%>%
    autoplot(main = 'MA(1) model with Theta = .2', col ='darkred')

MA_one(.8, 1)%>%
    autoplot(main = 'MA(1) model with Theta = .8', col ='darkred')

(d.) Produce a time plot for the series. How does the plot change as you change \(\theta_1\)?

In these series, with the error driven by \(\sigma^2 = 1\), the point-to-point changes in the MA(1) plot become sharper as \(\theta\) decreases, and smoother as \(\theta\) increases.
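The smoothing effect of \(\theta\) can also be seen analytically: an MA(1) process has lag-1 autocorrelation \(\rho_1 = \theta/(1+\theta^2)\) and zero autocorrelation at higher lags, so a larger \(\theta\) (up to 1) ties neighbouring points more strongly together:

```r
# Theoretical lag-1 autocorrelation of an MA(1) process
rho1 <- function(theta) theta / (1 + theta^2)

rho1(0.2)  # ~0.19
rho1(0.6)  # ~0.44
rho1(0.8)  # ~0.49
```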

(e.) Generate data from an ARMA(1,1) model with \(\phi_1 = 0.6\), \(\theta_1 =0.6\) and \(\sigma^2 = 1\)

ARMA_one_one <- function(theta, phi, sigma2){
    # Simulate 100 observations from an ARMA(1,1) process
    e <- rnorm(n = 100, sd = sqrt(sigma2))  # sigma2 is a variance; rnorm takes sd
    y <- ts(numeric(100))
    for (i in 2:100){
        y[i] <- theta*e[i-1] + phi*y[i-1] + e[i]
    }
    return(y)
}

ARMA_one_one(.6, .6, 1)%>%
    autoplot(main='ARMA(1,1) with theta = .6 and phi = .6', col ='darkgoldenrod')

Again, this seems to show a white-noise-like pattern with a constant mean and random variation.

(f.) Generate data from an AR(2) model with \(\phi_1 = -0.8\), \(\phi_2 =0.3\) and \(\sigma^2 = 1\) (Note that these parameters will give a non-stationary series.)

AR_2 <- function(phi_1, phi_2, sigma2){
    # Simulate 100 observations from an AR(2) process
    y <- ts(numeric(100))
    e <- rnorm(n = 100, sd = sqrt(sigma2))  # sigma2 is a variance; rnorm takes sd
    for (i in 3:100){
        y[i] <- phi_1*y[i-1] + phi_2*y[i-2] + e[i]
    }
    return(y)
}

AR_2(-.8, .3, 1)%>%
    autoplot(main = 'AR(2) with phi-one = -.8 and phi-two = .3', col = 'palevioletred2')

(g.) Graph the latter two series and compare them.
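A sketch for plotting the two series side by side, assuming the ARMA_one_one() and AR_2() helpers defined above (gridExtra is an assumed dependency; the seed is arbitrary):

```r
library(forecast)
library(gridExtra)  # assumed available for side-by-side plots

set.seed(11)  # arbitrary
p1 <- autoplot(ARMA_one_one(.6, .6, 1), main = 'ARMA(1,1), theta = phi = 0.6',
               col = 'darkgoldenrod')
p2 <- autoplot(AR_2(-.8, .3, 1), main = 'AR(2), phi1 = -0.8, phi2 = 0.3',
               col = 'palevioletred2')
grid.arrange(p1, p2, ncol = 2)
```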

In the ARMA(1,1) series, where both \(\theta\) and \(\phi\) are 0.6, the series remains white-noise-like. However, when we apply the two autoregressive lag coefficients in the AR(2) model, they compound each other, turning what would otherwise be a random white-noise series into an explosive one. This makes sense because we are adding fractional components of the two prior values to each successive \(y_t\), so the magnitudes grow over time. Because \(\phi_1\) is negative, the values oscillate between positive and negative, and they amplify in magnitude as the second AR component grows the \(y\) value at each consecutive time \(t\).

8.8 Consider austa, the total international visitors to Australia (in millions) for the period 1980-2015.

(a.) Use auto.arima() to find an appropriate ARIMA model. What model was selected. Check that the residuals look like white noise. Plot forecasts for the next 10 periods.

data(austa)
(fit <-auto.arima(austa))
## Series: austa 
## ARIMA(0,1,1) with drift 
## 
## Coefficients:
##          ma1   drift
##       0.3006  0.1735
## s.e.  0.1647  0.0390
## 
## sigma^2 estimated as 0.03376:  log likelihood=10.62
## AIC=-15.24   AICc=-14.46   BIC=-10.57
checkresiduals(fit)

fit%>%forecast(h=10) %>% autoplot()

Based on auto.arima(), the selected model is a single-differenced MA(1) model with drift: \(y_t = c + y_{t-1} + 0.3006\,\epsilon_{t-1} + \epsilon_t\)

(b.) Plot forecasts from an ARIMA(0,1,1) model with no drift and compare these to part a. Remove the MA term and plot again.

(fit_ma <- Arima(austa, order = c(0,1,1), include.mean = FALSE))
## Series: austa 
## ARIMA(0,1,1) 
## 
## Coefficients:
##          ma1
##       0.4936
## s.e.  0.1265
## 
## sigma^2 estimated as 0.04833:  log likelihood=3.73
## AIC=-3.45   AICc=-3.08   BIC=-0.34
fit_ma%>%
    autoplot()

(fit_noma <- Arima(austa, order = c(0,1,0), include.mean = FALSE))
## Series: austa 
## ARIMA(0,1,0) 
## 
## sigma^2 estimated as 0.06423:  log likelihood=-1.62
## AIC=5.24   AICc=5.36   BIC=6.8
fit_noma%>%
    autoplot()

With MA

forecast(fit_ma, h=10)%>%autoplot()  

Without MA

forecast(fit_noma, h=10)%>%autoplot()

(c.) Plot forecasts from an ARIMA(2,1,3) model with drift.

(fit_arima <- Arima(austa, order = c(2,1,3), include.drift = TRUE))
## Series: austa 
## ARIMA(2,1,3) with drift 
## 
## Coefficients:
##          ar1      ar2      ma1     ma2     ma3   drift
##       1.7648  -0.8513  -1.7684  0.5571  0.2224  0.1725
## s.e.  0.1136   0.1182   0.2433  0.4351  0.2150  0.0051
## 
## sigma^2 estimated as 0.0276:  log likelihood=13.72
## AIC=-13.44   AICc=-9.29   BIC=-2.55
fit_arima%>%
    autoplot()

fit_arima%>%
  forecast(h=10)%>%
  autoplot()

accuracy(forecast(fit_arima))
##                        ME      RMSE      MAE       MPE     MAPE      MASE
## Training set -0.004129819 0.1491068 0.114857 -1.203567 4.326425 0.5637093
##                     ACF1
## Training set -0.01736821

Remove the constant and see what happens.

(fit_arima <- Arima(austa, order = c(2,1,3),  include.constant = FALSE))
fit_arima%>%
    autoplot()

In this state, the AR coefficients imply non-stationarity, and the fit throws an error: Error in stats::arima(x = x, order = order, seasonal = seasonal, include.mean = include.mean, : non-stationary AR part from CSS

This can be corrected by switching from the default CSS-ML estimation to full maximum likelihood ('ML').

(fit_arima <- Arima(austa, order = c(2,1,3),  include.constant = FALSE, method='ML'))
## Series: austa 
## ARIMA(2,1,3) 
## 
## Coefficients:
##          ar1     ar2     ma1      ma2      ma3
##       0.0004  0.9996  0.4633  -0.9893  -0.4625
## s.e.  0.0031  0.0031  0.2283   0.0329   0.2261
## 
## sigma^2 estimated as 0.03568:  log likelihood=9.24
## AIC=-6.48   AICc=-3.48   BIC=2.85
fit_arima%>%
    autoplot()

fit_arima%>%
  forecast(h=10)%>%
  autoplot()

accuracy(forecast(fit_arima))
##                      ME      RMSE      MAE      MPE     MAPE      MASE
## Training set 0.03972287 0.1724413 0.140205 1.685829 4.593002 0.6881153
##                     ACF1
## Training set -0.07513453

This clearly makes the model less useful: the prediction intervals widen greatly, and the inverse unit roots move to the far edges of the unit circle, near (-1, 0) and (1, 0), for both the AR and MA components, suggesting the model is only marginally stationary and invertible.

Including the constant term seems to improve the performance of the model and produces a more compact prediction interval.

In order to compare apples to apples, I refit the model with drift using the 'ML' method:

(fit_arima <- Arima(austa, order = c(2,1,3),  include.drift = TRUE, method='ML'))
## Series: austa 
## ARIMA(2,1,3) with drift 
## 
## Coefficients:
##          ar1      ar2      ma1     ma2     ma3   drift
##       1.7648  -0.8513  -1.7684  0.5571  0.2224  0.1725
## s.e.  0.1136   0.1182   0.2433  0.4351  0.2150  0.0051
## 
## sigma^2 estimated as 0.0276:  log likelihood=13.72
## AIC=-13.44   AICc=-9.29   BIC=-2.55
fit_arima%>%
    autoplot()

fit_arima%>%
  forecast(h=10)%>%
  autoplot()

accuracy(forecast(fit_arima))
##                        ME      RMSE      MAE       MPE     MAPE      MASE
## Training set -0.004129819 0.1491068 0.114857 -1.203567 4.326425 0.5637093
##                     ACF1
## Training set -0.01736821

As expected, this is approximately the same as the original order = (2,1,3) model, but it makes the comparison of the training error metrics more appropriate. The AICc is considerably lower with the constant (drift) than without, as are the forecasting RMSE and the width of the resulting prediction interval. It simply makes sense to allow for drift.

(d.) Plot forecasts from an ARIMA(0,0,1) model with a constant. Remove the MA term and plot again.

(fit_ma <- Arima(austa, order = c(0,0,1), include.mean = TRUE))
## Series: austa 
## ARIMA(0,0,1) with non-zero mean 
## 
## Coefficients:
##          ma1    mean
##       1.0000  3.5515
## s.e.  0.0907  0.3090
## 
## sigma^2 estimated as 0.9347:  log likelihood=-50.64
## AIC=107.28   AICc=108.03   BIC=112.04
fit_ma%>%
    autoplot()

(fit_noma <- Arima(austa, order = c(0,0,0), include.mean = FALSE))
## Series: austa 
## ARIMA(0,0,0) with zero mean 
## 
## sigma^2 estimated as 15.72:  log likelihood=-100.67
## AIC=203.34   AICc=203.46   BIC=204.93
fit_noma%>%
    autoplot()

With MA

forecast(fit_ma, h=10)%>%autoplot()  

Without MA

forecast(fit_noma, h=10)%>%autoplot()

(e.) Plot forecasts from an ARIMA(0,2,1) model with no constant.

(fit_ma <- Arima(austa, order = c(0,2,1), include.constant = FALSE))
## Series: austa 
## ARIMA(0,2,1) 
## 
## Coefficients:
##           ma1
##       -0.7263
## s.e.   0.2277
## 
## sigma^2 estimated as 0.03969:  log likelihood=6.74
## AIC=-9.48   AICc=-9.09   BIC=-6.43
fit_ma%>%
    autoplot()

forecast(fit_ma, h=10)%>%autoplot()